mouse movement
Supplementary Information A Collecting Internet Data A.1 Initial Unclean Dataset Curation
Our goal was to curate a video dataset of Minecraft gameplay from the survival game mode. Minecraft Survival game mode that include such visual artifacts. Please help us identify screenshots that belong only to the survival mode in Minecraft. Survival mode is identified by the info at the bottom of the screen: a health bar (row of hearts), a hunger bar (row of chicken drumsticks), and a bar showing items held. Survival Mode: Valid survival mode videos have health/hunger bars and an item hotbar at the bottom of the screen. Creative Mode: Creative mode only has an item hotbar and should be classified as None of the Above.
- Leisure & Entertainment > Games > Computer Games (1.00)
- Leisure & Entertainment > Sports (0.70)
In-Application Defense Against Evasive Web Scans through Behavioral Analysis
Ousat, Behzad, Shariatnasab, Mahshad, Schafir, Esteban, Chaharsooghi, Farhad Shirani, Kharraz, Amin
Web traffic has evolved to include both human users and automated agents, ranging from benign web crawlers to adversarial scanners such as those capable of credential stuffing, command injection, and account hijacking at the web scale. The financial costs of these adversarial activities are estimated to exceed tens of billions of dollars in 2023. In this work, we introduce WebGuard, a low-overhead in-application forensics engine, to enable robust identification and monitoring of automated web scanners, and help mitigate the associated security risks. WebGuard focuses on the following design criteria: (i) integration into web applications without any changes to the underlying software components or infrastructure, (ii) minimal communication overhead, (iii) capability for real-time detection, e.g., within hundreds of milliseconds, and (iv) attribution capability to identify new behavioral patterns and detect emerging agent categories. To this end, we have equipped WebGuard with multi-modal behavioral monitoring mechanisms, such as monitoring spatio-temporal data and browser events. We also design supervised and unsupervised learning architectures for real-time detection and offline attribution of human and automated agents, respectively. Information-theoretic analysis and empirical evaluations are provided to show that multi-modal data analysis, as opposed to uni-modal analysis which relies solely on mouse movement dynamics, significantly improves time-to-detection and attribution accuracy. Numerical evaluations using real-world data collected via WebGuard show high accuracy within hundreds of milliseconds, with a communication overhead below 10 KB per second.
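To make "mouse movement dynamics" concrete, the sketch below computes a few speed statistics from timestamped pointer samples. It is only an illustration: the abstract does not specify WebGuard's actual features, so the sample format `(t, x, y)`, the function name `mouse_speed_features`, and the chosen statistics are all assumptions.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical sample format: (timestamp in seconds, x in pixels, y in pixels).
Sample = Tuple[float, float, float]


def mouse_speed_features(samples: List[Sample]) -> Dict[str, float]:
    """Summarise pointer speed over one short observation window."""
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)

    if not speeds:
        return {"n": 0.0, "mean_speed": 0.0, "max_speed": 0.0, "speed_std": 0.0}

    mean = sum(speeds) / len(speeds)
    variance = sum((s - mean) ** 2 for s in speeds) / len(speeds)
    return {
        "n": float(len(speeds)),
        "mean_speed": mean,
        "max_speed": max(speeds),
        "speed_std": math.sqrt(variance),
    }


if __name__ == "__main__":
    window = [(0.0, 0.0, 0.0), (0.5, 3.0, 4.0), (1.0, 6.0, 8.0)]
    print(mouse_speed_features(window))
```

In a multi-modal setting, statistics like these would be one input among several, alongside features derived from browser events.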
- North America > United States > Florida > Hillsborough County > University (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.46)
Human-in-the-Loop AI for Cheating Ring Detection
Shih, Yong-Siang, Liao, Manqian, Liu, Ruidong, Baig, Mirza Basim
Online exams have become popular in recent years due to their accessibility. However, some concerns have been raised about the security of the online exams, particularly in the context of professional cheating services aiding malicious test takers in passing exams, forming so-called "cheating rings". In this paper, we introduce a human-in-the-loop AI cheating ring detection system designed to detect and deter these cheating rings. We outline the underlying logic of this human-in-the-loop AI system, exploring its design principles tailored to achieve its objectives of detecting cheaters. Moreover, we illustrate the methodologies used to evaluate its performance and fairness, aiming to mitigate the unintended risks associated with the AI system. The design and development of the system adhere to Responsible AI (RAI) standards, ensuring that ethical considerations are integrated throughout the entire development process.
Simulator-Free Visual Domain Randomization via Video Games
Trivedi, Chintan, Rašajski, Nemanja, Makantasis, Konstantinos, Liapis, Antonios, Yannakakis, Georgios N.
Domain randomization is an effective computer vision technique for improving transferability of vision models across visually distinct domains exhibiting similar content. Existing approaches, however, rely extensively on tweaking complex and specialized simulation engines that are difficult to construct, subsequently affecting their feasibility and scalability. This paper introduces BehAVE, a video understanding framework that uniquely leverages the plethora of existing commercial video games for domain randomization, without requiring access to their simulation engines. Under BehAVE (1) the inherent rich visual diversity of video games acts as the source of randomization and (2) player behavior -- represented semantically via textual descriptions of actions -- guides the *alignment* of videos with similar content. We test BehAVE on 25 games of the first-person shooter (FPS) genre across various video and text foundation models and we report its robustness for domain randomization. BehAVE successfully aligns player behavioral patterns and is able to zero-shot transfer them to multiple unseen FPS games when trained on just one FPS game. In a more challenging setting, BehAVE manages to improve the zero-shot transferability of foundation models to unseen FPS games (up to 22%) even when trained on a game of a different genre (Minecraft). Code and dataset can be found at https://github.com/nrasajski/BehAVE.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Middle East > Malta > Eastern Region > Northern Harbour District > Msida (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Games (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Behavioural Cloning in VizDoom
Spick, Ryan, Bradley, Timothy, Raina, Ayush, Amadori, Pierluigi Vito, Moss, Guy
This paper describes methods for training autonomous agents to play the game "Doom 2" through Imitation Learning (IL) using only pixel data as input. We also explore how Reinforcement Learning (RL) compares to IL for humanness by comparing camera movement and trajectory data. Through behavioural cloning, we examine the ability of individual models to learn varying behavioural traits. We attempt to mimic the behaviour of real players with different play styles, and find we can train agents that behave aggressively, passively, or simply more human-like than traditional AIs. We propose these methods of introducing more depth and human-like behaviour to agents in video ...
In recent years, DNNs have shown promising results in the field of behavioural cloning (BC) [5, 18]. BC is a form of Imitation Learning (IL), where we train an artificial "agent" to mimic actions from an observable state of expert data [34]. Agents are trained using a number of historical states, be they image frames or other data, and their corresponding actions. The learning is performed by using the final frame's associated action as the "target", this target being passed to some loss function. The loss function will reinforce the observed frame's predicted action; doing this over an extremely large dataset will achieve an agent that can predict the best action to take at any one given set of input image frames [17].
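The training setup described above (a stack of historical frames as input, the final frame's action as the target, and a loss function comparing prediction to target) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's method: it swaps the deep network for a linear softmax policy trained with cross-entropy, frames are tiny hand-made feature vectors rather than pixels, and the names `make_examples`, `train_bc`, and `predict` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Example = Tuple[List[float], int]


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_examples(frames: Sequence[Sequence[float]], actions: Sequence[int],
                  history: int) -> List[Example]:
    """Pair each window of `history` frames with the action of its final frame."""
    examples = []
    for end in range(history - 1, len(frames)):
        window = frames[end - history + 1 : end + 1]
        features = [v for frame in window for v in frame]  # flatten the frame stack
        examples.append((features, actions[end]))
    return examples


def train_bc(examples: List[Example], n_actions: int, epochs: int = 200,
             lr: float = 0.5, seed: int = 0) -> List[List[float]]:
    """Fit a linear softmax policy by SGD on the cross-entropy loss."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
               for _ in range(n_actions)]
    for _ in range(epochs):
        for features, target in examples:
            logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
            probs = softmax(logits)
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a is (p_a - 1[a == target]).
                grad = probs[a] - (1.0 if a == target else 0.0)
                for j, x in enumerate(features):
                    weights[a][j] -= lr * grad * x
    return weights


def predict(weights: List[List[float]], features: Sequence[float]) -> int:
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy "frames": two-value vectors; the expert's action is 0 or 1.
    frames = [[1, 0], [0, 1], [1, 0], [1, 0], [0, 1], [0, 1]]
    actions = [0, 1, 0, 0, 1, 1]
    data = make_examples(frames, actions, history=2)
    policy = train_bc(data, n_actions=2)
    print([predict(policy, f) for f, _ in data], [a for _, a in data])
```

A pixel-based agent would replace the linear policy with a convolutional network and the toy vectors with image frames; the pairing of frame windows with final-frame actions stays the same.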
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
- Europe > Italy > Sardinia (0.04)
- Europe > Greece (0.04)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Media (0.78)
- Information Technology (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Can an AI play Minecraft as well as humans?
Minecraft is a 3D sandbox game developed by Mojang Studios where players interact with a fully modifiable three-dimensional environment made of blocks and entities. Its diverse gameplay lets players choose the way they play, allowing for countless possibilities. OpenAI, the artificial intelligence research organization co-founded by Elon Musk, has trained an AI to play Minecraft almost as well as humans. It only took about 70,000 hours of binging YouTube videos. The internet contains an enormous amount of publicly available videos that we can learn from. However, these videos only provide a record of what happened but not precisely how it was achieved, i.e. you will not know the exact sequence of mouse movements and keys pressed.
AI learns how to play Minecraft by watching videos - AI News
OpenAI has trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play, while using just a small amount of labeled contractor data. With a bit of fine-tuning, the AI research and deployment company is confident that its model can learn to craft diamond tools, a task that usually takes proficient humans over 20 minutes (24,000 actions). Its model uses the native human interface of keypresses and mouse movements, making it quite general, and represents a step towards general computer-using agents. A spokesperson for the Microsoft-backed firm said: "The internet contains an enormous amount of publicly available videos that we can learn from. You can watch a person make a gorgeous presentation, a digital artist draw a beautiful sunset, and a Minecraft player build an intricate house. However, these videos only provide a record of what happened but not precisely how it was achieved, i.e. you will not know the exact sequence of mouse movements and keys pressed. "If we would like to build large-scale foundation models in these domains as we've done in language with GPT, this lack of action labels poses a new challenge not present in the language domain, where 'action labels' are simply the next words in a sentence." In order to utilise the wealth of unlabeled video data available on the internet, OpenAI introduces a novel, yet simple, semi-supervised imitation learning method: Video PreTraining (VPT). The team begins by gathering a small dataset from contractors, recording not only their video but also the actions they took, which in this case are keypresses and mouse movements. With this data the company can train an inverse dynamics model (IDM), which predicts the action being taken at each step in the video. Importantly, the IDM can use past and future information to guess the action at each step.
The spokesperson added: "This task is much easier and thus requires far less data than the behavioral cloning task of predicting actions given past video frames only, which requires inferring what the person wants to do and how to accomplish it.
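The difference between the two tasks comes down to which frames the model may look at. The sketch below contrasts a causal context (behavioural cloning: past frames only) with a non-causal one (IDM: past and future frames), then uses an IDM to pseudo-label an unlabeled video. It is a minimal illustration under assumptions: the window sizes, the function names, and `toy_idm` (a hand-written rule standing in for the trained neural network) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def bc_context(frames: Sequence[float], t: int, past: int) -> List[float]:
    """Causal context: frames up to and including step t only."""
    return list(frames[max(0, t - past) : t + 1])


def idm_context(frames: Sequence[float], t: int, past: int, future: int) -> List[float]:
    """Non-causal context: frames both before and after step t."""
    return list(frames[max(0, t - past) : min(len(frames), t + future + 1)])


def pseudo_label(frames: Sequence[float],
                 idm: Callable[[List[float]], str],
                 past: int, future: int) -> List[Tuple[float, str]]:
    """Attach an IDM-predicted action to every frame of an unlabeled video."""
    return [(frames[t], idm(idm_context(frames, t, past, future)))
            for t in range(len(frames))]


def toy_idm(context: List[float]) -> str:
    """Stand-in IDM: infer the move from how the 1-D position changes."""
    if context[-1] > context[0]:
        return "right"
    if context[-1] < context[0]:
        return "left"
    return "noop"


if __name__ == "__main__":
    video = [0, 1, 2, 2, 1]  # toy "frames": the player's 1-D position
    print(bc_context(video, t=2, past=2))
    print(idm_context(video, t=2, past=1, future=1))
    print(pseudo_label(video, toy_idm, past=0, future=1))
```

The pseudo-labeled pairs can then serve as training data for a behavioural cloning policy, which at play time only ever sees the causal context.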
- North America > United States > California (0.06)
- Europe > Netherlands > North Holland > Amsterdam (0.06)
Counter-Strike Deathmatch with Large-Scale Behavioural Cloning
This paper describes an AI agent that plays the popular first-person-shooter (FPS) video game 'Counter-Strike: Global Offensive' (CSGO) from pixel input. The agent, a deep neural network, matches the performance of the medium difficulty built-in AI on the deathmatch game mode, whilst adopting a humanlike play style. Unlike much prior work in games, no API is available for CSGO, so algorithms must train and run in real-time. This limits the quantity of on-policy data that can be generated, precluding many reinforcement learning algorithms. Our solution uses behavioural cloning -- training on a large noisy dataset scraped from human play on online servers (4 million frames, comparable in size to ImageNet), and a smaller dataset of high-quality expert demonstrations. This scale is an order of magnitude larger than prior work on imitation learning in FPS games.
Automatic Player Identification in Dota 2
Yuen, Sizhe, Thomson, John D., Don, Oliver
Dota 2 is a popular, multiplayer online video game. Like many online games, players are mostly anonymous, being tied only to online accounts which can be readily obtained, sold and shared between multiple people. This makes it difficult to track or ban players who exhibit unwanted behavior online. In this paper, we present a machine learning approach to identify players based on a 'digital fingerprint' of how they play the game, rather than by account. We use data on mouse movements, in-game statistics and game strategy extracted from match replays and show that for best results, all of these are necessary. We are able to obtain a prediction accuracy of 95% for the problem of predicting if two different matches were played by the same player.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Utah (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)